{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "5acc9d7d-1b27-4641-b6ce-c1a5a4769e90",
   "metadata": {},
   "source": [
    "# Tutorial: Running populations with binary_c-python with a source file\n",
    "This notebook shows how to evolve a population of stars from a source file that contains a set of pre-determined systems.\n",
    "\n",
    "To enable source file sampling, configure the population object with `evolution_type=\"source_file\"` and provide a filename that points to the source file, i.e. `source_file_sampling_filename=source_file_sampling_filename`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "9a46fc3e-9a5e-4923-83e6-74f115029439",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "from binarycpython.utils.functions import temp_dir, output_lines\n",
    "from binarycpython import Population\n",
    "\n",
    "TMP_DIR = temp_dir(\"notebooks\", \"notebook_source_file\", clean_path=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6913f0a9-2736-45e3-bcd6-f0c5f47399c7",
   "metadata": {},
   "source": [
    "We will first set up a Population object configured for source file sampling, and add a custom logging routine and a parse function. Please note that the custom logging and the parsing of the binary_c output are kept very simple here; actual use-cases are more complex in their data handling. They are merely intended to support the showcase of source file sampling."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "8c18ce20-67e7-4195-9719-059a4a9c1d2a",
   "metadata": {},
   "outputs": [],
   "source": [
    "source_file_pop = Population(\n",
    "    tmp_dir=TMP_DIR,\n",
    "    evolution_type=\"source_file\",\n",
    "    num_cores=1\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "d9850a94-85ac-42b3-8d44-607538ad4474",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create custom logging statement\n",
    "custom_logging_code = \"\"\"\n",
    "Printf(\"EXAMPLE_SOURCE_FILE_LOGGING %30.12e %g %g %g %d\\\\n\",\n",
    "    stardata->model.time, // 1: time\n",
    "    stardata->star[0].mass, // 2: current mass of star 0\n",
    "    stardata->common.zero_age.mass[0], // 3: zero-age mass of star 0\n",
    "    stardata->model.probability, // 4: probability of the system\n",
    "    stardata->star[0].stellar_type // 5: stellar type of star 0\n",
    ");\n",
    "\"\"\"\n",
    "\n",
    "source_file_pop.set(\n",
    "    C_logging_code=custom_logging_code\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "7f360897-6f3c-4a8d-87e2-ab0e06070e1f",
   "metadata": {},
   "outputs": [],
   "source": [
    "def parse_function(self, output):\n",
    "    \"\"\"\n",
    "    Example parse function: print the last logged state of the system.\n",
    "    \"\"\"\n",
    "\n",
    "    parameters = [\"time\", \"mass\", \"zams_mass\", \"probability\", \"stellar_type\"]\n",
    "    value_dict = None\n",
    "\n",
    "    # Go over the output line by line.\n",
    "    for line in output_lines(output):\n",
    "        headerline = line.split()[0]\n",
    "\n",
    "        # Check the header and act accordingly\n",
    "        if headerline == \"EXAMPLE_SOURCE_FILE_LOGGING\":\n",
    "            values = line.split()[1:]\n",
    "\n",
    "            # Check that the number of values matches the number of column names\n",
    "            if len(parameters) != len(values):\n",
    "                raise ValueError(\"Number of column names isn't equal to number of columns\")\n",
    "\n",
    "            # Store the values of this line, overwriting those of earlier timesteps\n",
    "            value_dict = {key: float(value) for key, value in zip(parameters, values)}\n",
    "\n",
    "    # To avoid filling the notebook with every timestep, print only the last logged line.\n",
    "    if value_dict is not None:\n",
    "        print(value_dict)\n",
    "\n",
    "\n",
    "# Add the parsing function\n",
    "source_file_pop.set(\n",
    "    parse_function=parse_function,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "81f1dfa3-eaa9-4f09-9353-16444f6c375f",
   "metadata": {},
   "source": [
    "## File content/format\n",
    "Source file sampling supports two types of files. The type is selected via the option `source_file_sampling_type`, which can be either `command` or `column`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d4026ca1-6a2c-4430-af8f-30a1fa3da893",
   "metadata": {
    "tags": []
   },
   "source": [
    "### Command based\n",
    "This type of source file should contain lines that look like command-line calls to `binary_c`, i.e. a sequence of key + value pairs, optionally prepended by `binary_c`, e.g.:\n",
    "\n",
    "```\n",
    "binary_c M_1 10 M_2 5 orbital_period 10000\n",
    "binary_c M_1 1 M_2 0.5 orbital_period 1000 metallicity 0.001\n",
    "```\n"
   ]
  },
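  {
   "cell_type": "markdown",
   "id": "sketch-command-lines-from-dicts",
   "metadata": {},
   "source": [
    "If the systems are already available as Python dictionaries, lines in this format can be generated instead of typed by hand. The snippet below is a minimal sketch of that idea: `system_to_command_line` is a hypothetical helper written for this notebook, not part of binary_c-python, and it assumes that every key is a valid `binary_c` argument.\n",
    "\n",
    "```python\n",
    "def system_to_command_line(system, prefix=\"binary_c\"):\n",
    "    \"\"\"Turn a dict of parameter -> value into one command-style line (hypothetical helper).\"\"\"\n",
    "    pairs = \" \".join(f\"{key} {value}\" for key, value in system.items())\n",
    "    # The leading `binary_c` is optional, so an empty prefix is allowed\n",
    "    return f\"{prefix} {pairs}\" if prefix else pairs\n",
    "\n",
    "\n",
    "systems = [\n",
    "    {\"M_1\": 10, \"M_2\": 5, \"orbital_period\": 10000},\n",
    "    {\"M_1\": 1, \"M_2\": 0.5, \"orbital_period\": 1000, \"metallicity\": 0.001},\n",
    "]\n",
    "\n",
    "# One line per system, in the order of the list\n",
    "command_lines = [system_to_command_line(system) for system in systems]\n",
    "print(\"\\n\".join(command_lines))\n",
    "```\n",
    "\n",
    "The resulting lines can be written to a text file, one system per line, and that file can then be passed via `source_file_sampling_filename`."
   ]
  },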
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "66e853d7-7ee8-48c7-918d-26ff0d275bee",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "setting up the system_queue_filler now\n",
      "Loading source file from /tmp/binary_c_python-david/notebooks/notebook_source_file/example_command_based_sourcefile.txt\n",
      "Source file loaded\n",
      "Signalling processes to stop\n",
      "{'time': 15000.0, 'mass': 1.33478, 'zams_mass': 10.0, 'probability': 1.0, 'stellar_type': 13.0}\n",
      "{'time': 15000.0, 'mass': 0.602425, 'zams_mass': 1.0, 'probability': 1.0, 'stellar_type': 11.0}\n",
      "\n",
      "****************************************************\n",
      "*                Process 0 finished:               *\n",
      "*  generator started at 2023-05-18T17:57:54.917371 *\n",
      "* generator finished at 2023-05-18T17:57:55.331367 *\n",
      "*                   total: 0.41s                   *\n",
      "*           of which 0.36s with binary_c           *\n",
      "*                   Ran 2 systems                  *\n",
      "*           with a total probability of 2          *\n",
      "*         This thread had 0 failing systems        *\n",
      "*       with a total failed probability of 0       *\n",
      "*   Skipped a total of 0 zero-probability systems  *\n",
      "*                                                  *\n",
      "****************************************************\n",
      "\n",
      "\n",
      "**********************************************************\n",
      "*  Population-5407e244c9ee45b2848f4cb2dce32b03 finished! *\n",
      "*               The total probability is 2.              *\n",
      "*  It took a total of 0.80s to run 2 systems on 1 cores  *\n",
      "*                   = 0.80s of CPU time.                 *\n",
      "*              Maximum memory use 170.023 MB             *\n",
      "**********************************************************\n",
      "\n",
      "No failed systems were found in this run.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'population_id': '5407e244c9ee45b2848f4cb2dce32b03',\n",
       " 'evolution_type': 'source_file',\n",
       " 'failed_count': 0,\n",
       " 'failed_prob': 0,\n",
       " 'failed_systems_error_codes': [],\n",
       " 'errors_exceeded': False,\n",
       " 'errors_found': False,\n",
       " 'total_probability': 2,\n",
       " 'total_count': 2,\n",
       " 'start_timestamp': 1684429074.8716478,\n",
       " 'end_timestamp': 1684429075.6765742,\n",
       " 'time_elapsed': 0.8049263954162598,\n",
       " 'total_mass_run': 16.5,\n",
       " 'total_probability_weighted_mass_run': 16.5,\n",
       " 'zero_prob_stars_skipped': 0}"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# create an example source file with systems.\n",
    "example_command_based_sourcefile = os.path.join(TMP_DIR, 'example_command_based_sourcefile.txt')\n",
    "\n",
    "with open(example_command_based_sourcefile, 'w') as f:\n",
    "    f.write(\"\"\"binary_c M_1 10 M_2 5 orbital_period 10000\n",
    "binary_c M_1 1 M_2 0.5 orbital_period 1000 metallicity 0.001\"\"\")\n",
    "\n",
    "# Run the population\n",
    "source_file_pop.set(\n",
    "    source_file_sampling_filename=example_command_based_sourcefile,\n",
    "    source_file_sampling_type='command'\n",
    ")\n",
    "\n",
    "# evolve population\n",
    "source_file_pop.evolve()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9f901cef-a886-4588-8919-6d3767df120b",
   "metadata": {},
   "source": [
    "Alright. That worked well! Please note that some entries of the analytics dict are not meaningful here (e.g. `total_probability_weighted_mass_run`), because the systems are not weighted by actual probability distribution functions.\n",
    "\n",
    "Let's try the column based sampling next."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c2fb91d5-64c0-4b17-ac55-8341e6507cfd",
   "metadata": {},
   "source": [
    "### Column based\n",
    "This type of source file should start with a header line that indicates which parameter is stored in which column. The subsequent lines should contain only the values of the corresponding parameters, e.g.:\n",
    "\n",
    "```\n",
    "M_1 M_2 orbital_period\n",
    "10 5 1\n",
    "1 0.5 1000\n",
    "```\n"
   ]
  },
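  {
   "cell_type": "markdown",
   "id": "sketch-column-text-from-dicts",
   "metadata": {},
   "source": [
    "A file in this layout can also be produced from Python data. The snippet below is a minimal sketch under two assumptions: columns are separated by whitespace, as in the example above, and every system defines the same parameters. `systems_to_column_text` is a hypothetical helper, not part of binary_c-python.\n",
    "\n",
    "```python\n",
    "def systems_to_column_text(systems):\n",
    "    \"\"\"Format dicts that share the same keys as a header line plus value rows (hypothetical helper).\"\"\"\n",
    "    header = list(systems[0])\n",
    "\n",
    "    # Every row needs a value for every column of the header\n",
    "    for system in systems:\n",
    "        if set(system) != set(header):\n",
    "            raise ValueError(\"Every system must define the same parameters\")\n",
    "\n",
    "    rows = [\" \".join(str(system[key]) for key in header) for system in systems]\n",
    "    return \"\\n\".join([\" \".join(header)] + rows)\n",
    "\n",
    "\n",
    "column_text = systems_to_column_text([\n",
    "    {\"M_1\": 10, \"M_2\": 5, \"orbital_period\": 1},\n",
    "    {\"M_1\": 1, \"M_2\": 0.5, \"orbital_period\": 1000},\n",
    "])\n",
    "print(column_text)\n",
    "```\n",
    "\n",
    "Unlike the command based format, each row here must supply a value for every column, so a parameter that applies to only some systems has to be given explicitly for all of them."
   ]
  },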
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "d7a43037-2752-4ee0-9afa-e80088a6413d",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "setting up the system_queue_filler now\n",
      "Loading source file from /tmp/binary_c_python-david/notebooks/notebook_source_file/example_command_based_sourcefile.txt\n",
      "Source file loaded\n",
      "Signalling processes to stop\n",
      "{'time': 15000.0, 'mass': 0.680111, 'zams_mass': 2.0, 'probability': 1.0, 'stellar_type': 11.0}\n",
      "{'time': 15000.0, 'mass': 0.7, 'zams_mass': 0.7, 'probability': 1.0, 'stellar_type': 0.0}\n",
      "\n",
      "****************************************************\n",
      "*                Process 0 finished:               *\n",
      "*  generator started at 2023-05-18T17:57:55.737947 *\n",
      "* generator finished at 2023-05-18T17:57:56.064031 *\n",
      "*                   total: 0.33s                   *\n",
      "*           of which 0.27s with binary_c           *\n",
      "*                   Ran 2 systems                  *\n",
      "*           with a total probability of 2          *\n",
      "*         This thread had 0 failing systems        *\n",
      "*       with a total failed probability of 0       *\n",
      "*   Skipped a total of 0 zero-probability systems  *\n",
      "*                                                  *\n",
      "****************************************************\n",
      "\n",
      "\n",
      "**********************************************************\n",
      "*  Population-680d1fe0533f4e6f9ce1ea5794b07e9b finished! *\n",
      "*               The total probability is 2.              *\n",
      "*  It took a total of 0.72s to run 2 systems on 1 cores  *\n",
      "*                   = 0.72s of CPU time.                 *\n",
      "*              Maximum memory use 172.641 MB             *\n",
      "**********************************************************\n",
      "\n",
      "No failed systems were found in this run.\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "{'population_id': '680d1fe0533f4e6f9ce1ea5794b07e9b',\n",
       " 'evolution_type': 'source_file',\n",
       " 'failed_count': 0,\n",
       " 'failed_prob': 0,\n",
       " 'failed_systems_error_codes': [],\n",
       " 'errors_exceeded': False,\n",
       " 'errors_found': False,\n",
       " 'total_probability': 2,\n",
       " 'total_count': 2,\n",
       " 'start_timestamp': 1684429075.7047536,\n",
       " 'end_timestamp': 1684429076.4274688,\n",
       " 'time_elapsed': 0.7227151393890381,\n",
       " 'total_mass_run': 4.2,\n",
       " 'total_probability_weighted_mass_run': 4.2,\n",
       " 'zero_prob_stars_skipped': 0}"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create an example column-based source file.\n",
    "# Note: this reuses the path of the previous example, so that file is overwritten.\n",
    "example_column_based_sourcefile = os.path.join(TMP_DIR, 'example_command_based_sourcefile.txt')\n",
    "\n",
    "with open(example_column_based_sourcefile, 'w') as f:\n",
    "    f.write(\"\"\"M_1 M_2 orbital_period\n",
    "2 1 1\n",
    "0.7 0.5 1000\"\"\")\n",
    "\n",
    "# Run the population\n",
    "source_file_pop.set(\n",
    "    source_file_sampling_filename=example_column_based_sourcefile,\n",
    "    source_file_sampling_type='column'\n",
    ")\n",
    "\n",
    "# evolve population\n",
    "source_file_pop.evolve()\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ffc872ed-5757-4b3c-ae5e-6052cdf5b37b",
   "metadata": {},
   "source": [
    "That works fine too!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}